Tidying Up International Nucleotide Sequence Databases: Ecological, Geographical and Sequence Quality Annotation of ITS Sequences of Mycorrhizal Fungi

Authors

  • Leho Tedersoo
  • Kessy Abarenkov
  • R. Henrik Nilsson
  • Arthur Schüssler
  • Gwen-Aëlle Grelet
  • Petr Kohout
  • Jane Oja
  • Gregory M. Bonito
  • Vilmar Veldre
  • Teele Jairus
  • Martin Ryberg
  • Karl-Henrik Larsson
  • Urmas Kõljalg
Abstract

Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information detailing country of collection, geographical coordinates, interacting taxon and isolation source was supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi.
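The coverage figures in the abstract are per-field proportions over the annotated accessions. A minimal sketch of how such proportions could be computed is given here. The `Record` class, its field names and the `HYPO…` accessions are hypothetical illustrations and do not reflect the UNITE or PlutoF data model. Only the totals 35,632, 677 and 2,174 come from the abstract.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical record layout; field names are illustrative only.
@dataclass
class Record:
    accession: str
    country: Optional[str] = None
    coordinates: Optional[Tuple[float, float]] = None
    interacting_taxon: Optional[str] = None
    isolation_source: Optional[str] = None

FIELDS = ("country", "coordinates", "interacting_taxon", "isolation_source")

def coverage(records):
    """Percentage of records carrying a non-empty value for each metadata field."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in FIELDS}
    return {
        field: 100.0 * sum(1 for r in records if getattr(r, field)) / total
        for field in FIELDS
    }

# Made-up example records (not real INSD accessions).
records = [
    Record("HYPO0001", country="Estonia", isolation_source="root tip"),
    Record("HYPO0002", country="Sweden", coordinates=(59.9, 17.6),
           isolation_source="soil"),
    Record("HYPO0003", interacting_taxon="Picea abies",
           isolation_source="root tip"),
    Record("HYPO0004", country="Estonia"),
]
cov = coverage(records)

# Shares of flagged sequences, using the counts stated in the abstract.
TOTAL_SEQUENCES = 35632
chimeric_pct = round(100 * 677 / TOTAL_SEQUENCES, 1)
low_quality_pct = round(100 * 2174 / TOTAL_SEQUENCES, 1)
```

With the counts from the abstract, the chimeric and low-read-quality sequences come to roughly 1.9% and 6.1% of the 35,632 sequences.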

Similar resources

Data processing can mask biology: towards better reporting of fungal barcoding data?

Fungal barcoding, that is the use of genetic markers to identify fungal species, has contributed enormously to the rise of mycorrhizal research in the last decade (van der Heijden et al., 2015) because it allows quick and easy en masse identification of species or higher taxonomic ranks and grouping of sequences into entities; this speeds up ecological analyses and the discovery of new species ...

PlutoF—a Web Based Workbench for Ecological and Taxonomic Research, with an Online Implementation for Fungal ITS Sequences

DNA sequences accumulating in the International Nucleotide Sequence Databases (INSD) form a rich source of information for taxonomic and ecological meta-analyses. However, these databases include many erroneous entries, and the data itself is poorly annotated with metadata, making it difficult to target and extract entries of interest with any degree of precision. Here we describe the web-based...

Nucleotide sequence of cDNA encoding for preprochymosin in native goat (Capra hircus) from Iran

Prochymosin is one of the most important aspartic proteinases used as a milk-clotting enzyme in cheese production. In the present investigation we report the sequence of cDNA encoding goat (Capra hircus) preprochymosin and compare its nucleotide and deduced amino acid sequences with the preprochymosin sequences of other ruminants. Like bovine prochymosin, the caprine prochymosin cDNA encodes 365 amino ...

Protein Sequence Annotation in the Genome Era: The Annotation Concept of SWISS-PROT + TREMBL

SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation, a minimal level of redundancy and high level of integration with other databases. Ongoing genome sequencing projects have dramatically increased the number of protein sequences to be incorporated into SWISS-PROT. Since we do not want to dilute the quality standards of SWISS-PROT by incorporati...

Towards a unified paradigm for sequence-based identification of Fungi

The nuclear ribosomal internal transcribed spacer (ITS) region is the formal fungal barcode and in most cases the marker of choice for exploration of fungal diversity in environmental samples. Two problems are particularly acute in the pursuit of satisfactory taxonomic assignment of newly generated ITS sequences: (i) the lack of an inclusive, reliable public reference dataset, and (ii) the lack...


Volume 6

Publication date: 2011